Abstract A land data assimilation system (LDAS) can merge satellite observations (or
retrievals) of land surface hydrological conditions, including soil moisture, snow, and 
terrestrial water storage (TWS), into a numerical model of land surface processes. In 
theory, the output from such a system is superior to estimates based on the observations or 
the model alone, thereby enhancing our ability to understand, monitor, and predict key 
elements of the terrestrial water cycle. In practice, however, satellite observations do not 
correspond directly to the water cycle variables of interest. The present paper addresses 
various aspects of this seeming mismatch using examples drawn from recent research with 
the ensemble-based NASA GEOS-5 LDAS. These aspects include (1) the assimilation of 
coarse-scale observations into higher-resolution land surface models, (2) the partitioning of 
satellite observations (such as TWS retrievals) into their constituent water cycle components,
(3) the forward modeling of microwave brightness temperatures over land for
radiance-based soil moisture and snow assimilation, and (4) the selection of the most 
relevant types of observations for the analysis of a specific water cycle variable that is not 
observed (such as root zone soil moisture). The solution to these challenges involves the 
careful construction of an observation operator that maps from the land surface model 
variables of interest to the space of the assimilated observations. 

Keywords Land data assimilation • Land surface modeling • Satellite remote sensing • 
Soil moisture • Snow • Terrestrial water storage • Ensemble Kalman filter

1 Introduction

The water cycle plays a crucial role in Earth’s climate and environment, yet there are still
large gaps in our understanding of its components, particularly at the land surface (Lahoz 
and De Lannoy 2013; Trenberth and Asrar 2013). Over the past decade, there has been a 
steady increase in the number and types of satellite observations (or retrievals) related to 
land surface hydrological conditions, including soil moisture, snow, and terrestrial water 
storage (TWS; Bartalis et al. 2007; Bruinsma et al. 2010; Clifford 2010; de Jeu et al. 2008; 
Entekhabi et al. 2010; Foster et al. 2005, 2011; Gao et al. 2010; Hall and Riggs 2007; Hall 
et al. 2010; Horwath et al. 2011; Kelly 2009; Kerr et al. 2010; Li et al. 2007; Liu et al. 
2011b; Njoku et al. 2003; Parinussa et al. 2012; Pulliainen 2006; Rowlands et al. 2005, 
2010; Swenson and Wahr 2006; Tedesco and Narvekar 2010; Tedesco et al. 2010; Wahr 
et al. 2004). 

These observations can be assimilated into land surface models to provide land surface 
hydrological estimates that are generally superior to the satellite observations or model 
estimates alone (Andreadis and Lettenmaier 2006; Crow and Wood 2003; De Lannoy et al. 
2012; de Rosnay et al. 2012a, b; Draper et al. 2012; Drusch 2007; Dunne and Entekhabi 
2006; Durand and Margulis 2008; Forman et al. 2012; Houborg et al. 2012; Li et al. 2012; 
Liu et al. 2011a; Margulis et al. 2002; Pan and Wood 2006; Pan et al. 2008; Reichle and 
Koster 2005; Reichle et al. 2007, 2009; Sahoo et al. 2012; Su et al. 2008, 2010; Zaitchik 
et al. 2008). 

However, land data assimilation systems must be designed carefully such that a number 
of conceptual problems can be overcome and the potential improvements from data 
assimilation can be realized. Earlier work addressed the bias between the satellite observations
and model estimates within the assimilation system (De Lannoy et al. 2007; Drusch
et al. 2005; Kumar et al. 2012; Reichle and Koster 2004). Moreover, approaches to efficient
error modeling within the assimilation system, including adaptive methods, needed
to be developed (Crow and Reichle 2008; Crow and van den Berg 2010; Reichle et al. 
2008a, b). An overview of some relevant earlier literature in the context of the ensemble-based
Goddard Earth Observing System Model, Version 5 (GEOS-5) land data assimilation
system (LDAS) developed at the NASA Global Modeling and Assimilation Office 
(GMAO) is provided by Reichle et al. (2009). 

Despite the early successes, the design and application of land data assimilation systems 
still face additional conceptual problems. While land surface models are flexible in the 
design and choice of model variables, satellite observations do not necessarily correspond 
directly to the water cycle variables of interest. For example, space-borne microwave 
observations can be converted into estimates of snow amount or surface soil moisture, but 
the spatial resolution of such microwave-based retrievals is usually much coarser than 
desired. Moreover, satellites typically observe electromagnetic properties such as backscatter
and/or radiances (or brightness temperatures) that are only indirectly related to
snow amounts or soil moisture levels. Furthermore, satellite-observed backscatter and
radiances are at best sensitive to moisture in the top few centimeters of the soil. Information
on important water cycle components such as root zone soil moisture must
therefore be gained through even more indirect pathways in the land data assimilation 
system. 

The present paper addresses several major challenges that all relate to a seeming 
mismatch between the assimilated observations and the water cycle variables of interest. 
This mismatch can be overcome through the careful design of the land data assimilation 
system. The conceptual challenges discussed here can be summarized as follows:

1. How can coarse-scale satellite observations increase our knowledge of land surface
conditions at finer scales (horizontal downscaling), and how can unobserved areas be 
updated using information from neighboring observations? 

2. How can vertically integrated measurements (such as TWS) be partitioned into their 
component variables within the assimilation system? 

3. How can satellite radiances (rather than geophysical retrievals) be assimilated to 
improve estimates of land surface hydrological conditions (e.g., soil moisture and 
snow)? 

4. How can the most relevant types of observations be selected for the analysis of a water 
cycle component that is not observed (such as root zone soil moisture)? 

The present paper illustrates each of these conceptual problems based on recent progress 
using the GEOS-5 system for land surface hydrological data assimilation. The examples 
use satellite observations of land surface water cycle components from the Advanced 
Microwave Scanning Radiometer for EOS (AMSR-E), the Moderate Resolution Imaging 
Spectroradiometer (MODIS), the Gravity Recovery and Climate Experiment (GRACE) 
mission, the Advanced Scatterometer (ASCAT), and the Soil Moisture Ocean Salinity 
(SMOS) mission for the analysis of soil moisture (AMSR-E, ASCAT, SMOS, GRACE), 
snow (AMSR-E, MODIS, GRACE), and TWS (GRACE). After a brief discussion of the 
GEOS-5 LDAS, Sect. 2 provides details and references for the various satellite observations
used in the examples. Section 3 addresses each of the above-mentioned challenge
questions in a separate subsection. Results are discussed and summarized in Sect. 4. 
Finally, Sect. 5 provides conclusions and a brief outlook on future research directions. 


2 Data and Methods 

2.1 GEOS-5 Land Data Assimilation System 

The GEOS-5 LDAS consists of the NASA Catchment land surface model and an implementation
of the ensemble Kalman filter (EnKF; Evensen 2003). The GEOS-5 EnKF has
also been included in the NASA Land Information System, a comprehensive land surface 
modeling and assimilation software framework, so that it can be used with a variety of land 
surface models (Kumar et al. 2008a, b). A brief summary of the key characteristics of the 
system is provided below. For a more comprehensive discussion, see Reichle et al. (2009) 
and references therein. 

The Catchment land surface model (hereinafter Catchment model; Ducharne et al. 2000; 
Koster et al. 2000) differs from traditional, layer-based land surface models by including 
an explicit treatment of the spatial variation within each hydrological catchment (or 
computational element) of the soil water and water table depth, as well as its effect on 
runoff and evaporation. Within each element, the vertical profile of soil water down to the 
bedrock is given by the equilibrium soil moisture profile and the deviations from the 
equilibrium profile. The deviations are described by excess and deficit variables for a 
0-2 cm (or 0-5 cm) surface layer and for a “root zone” layer that extends from the surface 
to a depth zr of 75 cm < zr < 100 cm depending on local soil conditions. The spatial 
variability of soil moisture is diagnosed at each time step from the bulk water prognostic 
variables and the statistics of the catchment topography. One key feature of the Catchment 
model is the groundwater component implicit in the modeling of the water table depth 
(through the modeling of the subsurface water profile down to the bedrock). This
groundwater component is critically important for the assimilation of TWS retrievals
(Sect. 3.2). 

The Catchment model also includes a state-of-the-art, multi-layer, global snow model 
(Stieglitz et al. 2001). In each watershed, the evolution of the amount of water in the snow 
pack (or snow water equivalent; SWE), the snow depth, and the snow heat content in 
response to surface meteorological conditions and snow compaction is modeled using three 
layers. The soil, vegetation, and snow model parameters used in the Catchment model are 
from the NASA GEOS-5 global modeling system (Rienecker et al. 2008). 

The EnKF is a Monte-Carlo variant of the Kalman filter, which sequentially updates 
model forecasts in response to observations based on the relative uncertainty of the model 
and the observations. The key idea behind the EnKF is that the relevant parts of the model 
error covariance structure can be captured by a small ensemble of model trajectories. Each 
member of the ensemble experiences perturbed instances of the observed forcing fields 
(representing errors in the forcing data) and/or randomly generated noise that is added to 
the model parameters and prognostic variables (representing errors in model physics and 
parameters). The model error covariance matrices that are required for the filter update can 
then be diagnosed from the ensemble at the update time. The EnKF is flexible in its 
treatment of errors in model dynamics and parameters. It is also very suitable for modestly 
nonlinear problems and has become a popular choice for land data assimilation (Andreadis 
and Lettenmaier 2006; Durand and Margulis 2008; Kumar et al. 2008a, b; Pan and Wood 
2006; Reichle et al. 2002a, b; Su et al. 2008; Zhou et al. 2006). 
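
As an illustration only (this is not the GEOS-5 LDAS code), the ensemble update described above can be sketched in Python for a single scalar observation. The function name, the two-element state vector, and all error values are hypothetical choices made for this sketch, and the perturbed-observation form of the EnKF is assumed. The gain is diagnosed from the ensemble as the covariance between each state element and the model-predicted observation, divided by the sum of the predicted-observation variance and the observation error variance.

```python
import random
from statistics import fmean


def enkf_update(ensemble, h, obs, obs_err_std, rng):
    """Update every ensemble member with one scalar observation.

    ensemble    : list of state vectors (each a list of floats)
    h           : observation operator mapping a state vector to a float
    obs         : observed value
    obs_err_std : observation error standard deviation
    rng         : random.Random used to perturb the observation
    """
    n = len(ensemble)
    predicted = [h(x) for x in ensemble]  # model-predicted observations
    pred_mean = fmean(predicted)
    pred_var = sum((p - pred_mean) ** 2 for p in predicted) / (n - 1)

    n_state = len(ensemble[0])
    state_mean = [fmean(x[i] for x in ensemble) for i in range(n_state)]
    # Sample covariance between each state element and the predicted observation.
    cov_xy = [
        sum((x[i] - state_mean[i]) * (p - pred_mean)
            for x, p in zip(ensemble, predicted)) / (n - 1)
        for i in range(n_state)
    ]
    gain = [c / (pred_var + obs_err_std ** 2) for c in cov_xy]

    analysis = []
    for x, p in zip(ensemble, predicted):
        # Each member sees its own perturbed copy of the observation.
        innovation = obs + rng.gauss(0.0, obs_err_std) - p
        analysis.append([xi + k * innovation for xi, k in zip(x, gain)])
    return analysis


# Hypothetical example: state = [surface, root zone] soil moisture (m3/m3).
rng = random.Random(0)
prior = []
for _ in range(200):
    surface = 0.20 + rng.gauss(0.0, 0.04)
    root = 0.25 + 0.5 * (surface - 0.20) + rng.gauss(0.0, 0.01)
    prior.append([surface, root])

# Only the surface layer is observed; the root zone is updated through
# its ensemble covariance with the predicted surface observation.
posterior = enkf_update(prior, h=lambda x: x[0], obs=0.30,
                        obs_err_std=0.02, rng=rng)
```

In this toy setup the unobserved root zone element is adjusted only because its errors are correlated with those of the observed surface element across the ensemble.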

To realize the potential benefits from data assimilation, the assimilation system must be 
supplied with appropriate input parameters for the description of model and observation 
errors. For an ensemble-based system such as the GEOS-5 LDAS, for example, standard 
deviations, spatial and temporal correlations, and cross-correlations must be specified for 
the perturbations that are applied to each ensemble member. A detailed discussion of the 
error parameters in the examples discussed here is beyond the scope of the paper. The 
reader is referred to the references provided with each example as well as the overview 
discussion of Reichle et al. (2009). 

2.2 Assimilated Observations 

The data assimilation examples discussed in this paper use various types of satellite 
observations from a number of polar orbiting sensors/platforms, including passive and 
active microwave observations (AMSR-E, SMOS, and ASCAT), visible and near-infrared 
observations (MODIS), and gravimetric observations (GRACE). 

AMSR-E, which operated with nominal performance between 2002 and 2011, is a 
scanning, dual polarization radiometer that measured microwave emission from the Earth 
at six frequencies (6.9, 10.7, 18.7, 23.9, 36.5, and 89.0 GHz), ranging in resolution from 
~50 km at 6.9 GHz to ~5 km at 89.0 GHz (Knowles et al. 2006). Its successor, AMSR2,
was launched in May 2012 (http://www.jaxa.jp/projects/sat/gcom_w/index_e.html). The
training and validation of the empirical microwave radiative transfer model for snow-covered
land surfaces in Sect. 3.3.2 use the 10.7, 18.7, and 36.5 GHz AMSR-E brightness
temperatures, while the snow assimilation example in Sect. 3.1 uses SWE retrievals that 
are based on the difference between the 18.7 and the 36.5 GHz brightness temperatures 
(Kelly 2009). The soil moisture assimilation examples in Sect. 3.4 use surface (top 1 cm) 
soil moisture retrievals that are derived from the 6.9 and 10.7 GHz brightness temperatures 
(de Jeu et al. 2008; Njoku et al. 2003).

SMOS was launched in 2009 and its Microwave Imaging Radiometer with Aperture
Synthesis (MIRAS) sensor provides multi-angular L-band (1.4 GHz) brightness temperature
observations at horizontal and vertical polarization and a nominal spatial resolution of
43 km (Kerr et al. 2010). SMOS brightness temperatures are used in Sect. 3.3.1. 

ASCAT is a 5.3 GHz radar system that illuminates the Earth’s surface and measures the 
energy scattered back to the instrument. The ASCAT surface (top 1 cm) soil moisture 
retrievals used in Sect. 3.4.2 are derived from these backscatter measurements (Bartalis 
et al. 2007; Wagner et al. 1999) and are provided in units of degree of saturation. 

MODIS (2000-present) provides visible and near-infrared observations from which 
snow cover fraction (SCF) can be retrieved under clear-sky conditions (Hall and Riggs 
2007). High-resolution (500 m) MODIS SCF retrievals are used in Sect. 3.1.

Through the measurement of gravitational anomalies associated with the accumulation 
(or loss) of mass near the Earth’s surface, GRACE provides approximately monthly, basin-scale
(>150,000 km²) estimates of variations in TWS, which includes snow, ice, surface
water, soil moisture, and groundwater (Bruinsma et al. 2010; Horwath et al. 2011; Rodell 
et al. 2009; Rowlands et al. 2005, 2010; Swenson and Wahr 2006; Tang et al. 2010; Wahr 
et al. 2004). The assimilation experiments of Sect. 3.2 use GRACE TWS retrievals. 

2.3 Validation Data and Approach 

For each of the examples presented in Sect. 3, the output from the assimilation system was 
evaluated against independent data from various sources. In Sect. 3.1, in situ SWE measurements
from United States Department of Agriculture Snowpack Telemetry (SNOTEL;
Schaefer et al. 2007) network sites in Colorado were used for evaluation, along with snow 
depth measurements from National Oceanic and Atmospheric Administration Cooperative 
Observer Program (COOP; http://www.ncdc.noaa.gov) sites. 

SWE estimates for the Mackenzie River basin, used for evaluation in Sect. 3.2, were 
derived from the daily snow depth product of the Canadian Meteorological Centre (CMC) 
daily snow analysis (Brasnett 1999; Brown and Brasnett 2010) at a horizontal resolution of 
approximately 24 km. The CMC snow analysis is based on optimal interpolation of in situ 
daily snow depth observations and aviation reports with a first-guess field generated from a 
snow model driven by output from the CMC weather model. Using the snow class map 
shown in Sturm et al. (1995), SWE estimates were obtained by multiplying the CMC snow 
depths with the Sturm et al. (2010) snow densities. Furthermore, runoff estimates for the 
Mackenzie River basin and its major sub-basins provided by the Global Runoff Data 
Center (GRDC; http://www.bafg.de/GRDC) were used in Sect. 3.2. 

The radiative transfer models of Sect. 3.3 were evaluated with AMSR-E and SMOS 
microwave brightness temperatures using a split sample approach in which one portion of 
the satellite brightness data was used for calibration or training and another, different 
portion was used for evaluation. 

In situ profile soil moisture observations used for evaluation in Sect. 3.4 are from the 
United States Department of Agriculture Soil Climate Analysis Network (SCAN)/SNOTEL
(Schaefer et al. 2007) network in the contiguous US and from the Murrumbidgee Soil
Moisture Monitoring Network (Smith et al. 2012) in Australia. Both sets of measurements
were subjected to extensive quality control steps, including automatic detection of problematic
observations and a visual inspection of the time series prior to using the data for
evaluation.

Metrics used for skill assessment include the bias, root mean square error (RMSE), and
time series correlation coefficient (R). When specified, anomalies were computed by
removing a seasonally varying climatology from the data before computing the metrics. 
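
For concreteness, the three skill metrics and the anomaly computation can be written out as a short Python sketch. The function names are hypothetical, and the climatology is assumed here to be the multi-year mean for each seasonal index (for example, the calendar month); the studies cited in Sect. 3 may construct their climatologies differently.

```python
from math import sqrt
from statistics import fmean


def bias(est, ref):
    """Mean difference between estimates and reference data."""
    return fmean(e - r for e, r in zip(est, ref))


def rmse(est, ref):
    """Root mean square error of estimates relative to reference data."""
    return sqrt(fmean((e - r) ** 2 for e, r in zip(est, ref)))


def correlation(est, ref):
    """Time series (Pearson) correlation coefficient R."""
    me, mr = fmean(est), fmean(ref)
    cov = sum((e - me) * (r - mr) for e, r in zip(est, ref))
    var_e = sum((e - me) ** 2 for e in est)
    var_r = sum((r - mr) ** 2 for r in ref)
    return cov / sqrt(var_e * var_r)


def anomalies(values, season_index):
    """Remove a seasonally varying climatology.

    Assumption: the climatology is the mean of all samples that share
    the same seasonal index (e.g. calendar month).
    """
    groups = {}
    for v, k in zip(values, season_index):
        groups.setdefault(k, []).append(v)
    clim = {k: fmean(vs) for k, vs in groups.items()}
    return [v - clim[k] for v, k in zip(values, season_index)]
```

An anomaly correlation is then obtained by applying `correlation` to the anomalies of both the estimates and the reference data.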


3 Results 

3.1 Assimilation of Sparse and Coarse-Scale Observations 

Snow is an important component of the land system because of its strong impact on the 
land surface water and energy balance, weather, climate, and water resources (Barnett et al. 
2005). However, land surface models often represent snow processes poorly. Satellite 
observations of SWE can be retrieved from passive microwave sensors, but they are only 
available at relatively coarse resolution. Moreover, SWE retrievals, like most satellite 
observations, do not provide complete spatial and continuous temporal coverage due to 
orbit or sensor limitations. The challenge is therefore to design an assimilation system that 
can use coarse-scale satellite observations to provide enhanced model estimates at the finer 
scales of interest (horizontal downscaling) and that can also propagate the information to 
intermittently unobserved areas. 

Using AMSR-E SWE retrievals and MODIS SCF observations, De Lannoy et al. (2010, 
2012) developed a data assimilation and downscaling technique for estimating fine-scale 
(1 km) snow fields using coarse-scale (25 km) SWE retrievals and fine-scale (500 m) SCF 
retrievals for a domain in Northern Colorado, USA. In their study, the authors used the LIS 
version of the GEOS-5 EnKF together with the Noah land surface model (Ek et al. 2003) 
(rather than the GEOS-5 LDAS and the Catchment model used elsewhere in this paper). 
The Noah model simulates a single snow layer with two prognostic variables for SWE and 
snow depth. The default LIS soil, vegetation, and general parameter tables for Noah were 
used, including a Noah-specific maximum snow albedo. 

Figure 1 shows schematically how the coarse-scale SWE retrievals are used. The fine- 
scale model grid is represented by the dashed lines in the figure. The coarse-scale grid of 
the SWE observations is represented by the solid lines and light/dark gray shading, and the 
center points of individual SWE retrievals are marked with crosses. Let us now consider 
the analysis update of the fine-scale model grid cell indicated by the solid black square. 
First, it is important to emphasize that the coarse-scale SWE retrievals are not compared 
directly to the SWE estimate at the fine-scale model grid cell. Rather, the model SWE is 
aggregated to the coarse grid of the retrievals, that is, the fine-scale model forecast is 
mapped into the coarse-scale observation space. This aggregation is part of the observation 
operator that maps the model states to the observations. Observation-minus-model-forecast 
residuals (or innovations) are then computed at the coarse scale of the observation space.
The Kalman gain matrix transforms the (observation-space) innovations into the (model-space)
increments. It is computed from error correlations between the model states at the
fine scale and the model-predicted measurements at the coarse scale. Finally, the increments
are added to the (fine-scale) model forecast in the analysis update. See De Lannoy
et al. (2010) for a discussion based on equations. 

Second, multiple coarse-scale SWE retrievals in the vicinity of the fine-scale model grid
cell in question are used for the analysis update. Specifically, the update uses the three
coarse-scale SWE retrievals marked by black crosses that are within a given radius
(indicated by the white semi-circle) around the fine-scale model grid cell in question
(Fig. 1). Note that this model grid cell would be updated even if the SWE retrieval directly
covering it were unavailable; the two neighboring SWE retrievals (dark gray shading)
would still contribute to the update. The connection between the neighboring SWE retrievals
and the model grid cell in question relies on horizontal model error correlations that
are due to, for example, errors in large-scale model forcing fields such as snowfall or air
temperature.

Fig. 1 Schematic of the distributed (“three-dimensional”) EnKF update used for the
assimilation of coarse-scale snow observations. See text for details. Adapted from De Lannoy
et al. (2010)
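
The ingredients of this distributed update can be sketched in Python: the aggregation of fine-scale model SWE to a coarse retrieval footprint, the selection of retrievals within a search radius, and the ensemble-derived gain that links a coarse-scale innovation to a single fine-scale cell. This is a minimal sketch, not the LIS or GEOS-5 implementation. All names are hypothetical, a simple unweighted average is assumed for the aggregation, and the grid is reduced to a flat list of cells.

```python
from math import hypot
from statistics import fmean


def aggregate_to_coarse(fine_swe, footprint):
    """Observation operator: average fine-scale SWE over one coarse footprint.

    fine_swe  : SWE (mm) for every fine-scale cell of one ensemble member
    footprint : indices of the fine-scale cells inside the coarse retrieval
    """
    return fmean(fine_swe[i] for i in footprint)


def retrievals_in_radius(cell_xy, retrieval_centers, radius_km):
    """Indices of coarse retrievals centered within radius_km of a fine cell."""
    cx, cy = cell_xy
    return [j for j, (x, y) in enumerate(retrieval_centers)
            if hypot(x - cx, y - cy) <= radius_km]


def cell_gain(ensemble, cell, footprint, obs_err_std):
    """Gain mapping one coarse-scale innovation to an increment at one fine cell."""
    n = len(ensemble)
    predicted = [aggregate_to_coarse(member, footprint) for member in ensemble]
    pred_mean = fmean(predicted)
    cell_mean = fmean(member[cell] for member in ensemble)
    cov = sum((m[cell] - cell_mean) * (p - pred_mean)
              for m, p in zip(ensemble, predicted)) / (n - 1)
    var = sum((p - pred_mean) ** 2 for p in predicted) / (n - 1)
    return cov / (var + obs_err_std ** 2)


# Hypothetical 3-member ensemble on 4 fine cells (SWE in mm).
# Cells 0 and 1 lie inside the retrieval footprint; cells 2 and 3 do not.
ensemble = [
    [10.0, 12.0, 11.0, 30.0],
    [20.0, 22.0, 21.0, 30.0],
    [30.0, 32.0, 31.0, 30.0],
]
footprint = [0, 1]
gain_correlated = cell_gain(ensemble, 2, footprint, obs_err_std=10.0)
gain_uncorrelated = cell_gain(ensemble, 3, footprint, obs_err_std=10.0)
```

Cell 2 lies outside the footprint but varies together with the cells inside it, so its gain is nonzero and it would receive an increment. Cell 3 has no ensemble spread and therefore no error covariance with the predicted retrieval, so its gain is zero.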

To assimilate SCF, the Noah model snow depletion curve acts as the observation 
operator that converts fine-scale modeled SWE into SCF estimates. Unlike binary indicators
of snow presence, the continuous SCF observations used here can thus be assimilated
with an EnKF, taking advantage of the distribution of SCF values across the
ensemble. Snow-free or fully snow-covered conditions in the model-forecast ensemble
were addressed by supplementing the EnKF with rule-based update procedures (De
Lannoy et al. 2012). If at a given time and location all members of the model-forecast
ensemble are snow-free but the SCF observation indicates the presence of snow, then a
nominal amount of snow is added to the model forecast. If all forecast ensemble members
have full snow cover and the observed SCF indicates less than full cover, then the model-forecast
SWE and snow depth are reduced by a fixed fraction.
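
The two rules can be expressed as a short Python sketch. The depletion curve below is a placeholder and not the Noah formulation, and the nominal snow amount and the reduction fraction are hypothetical values, since neither is specified here. Only SWE is shown; snow depth would be treated the same way.

```python
def toy_depletion_curve(swe_mm, swe_full_mm=40.0):
    """Placeholder observation operator (NOT the Noah depletion curve).

    SCF rises linearly with SWE and saturates at full cover.
    """
    return min(1.0, max(0.0, swe_mm) / swe_full_mm)


def rule_based_update(swe_ensemble, observed_scf,
                      nominal_swe_mm=5.0, reduce_fraction=0.2):
    """Apply the rule-based update when the ensemble has no SCF spread.

    Returns the updated SWE members, or None when the forecast ensemble
    has usable SCF spread and the regular EnKF update should be used.
    nominal_swe_mm and reduce_fraction are hypothetical values.
    """
    scf = [toy_depletion_curve(s) for s in swe_ensemble]
    if all(f == 0.0 for f in scf) and observed_scf > 0.0:
        # All members snow-free, observation shows snow: add nominal snow.
        return [s + nominal_swe_mm for s in swe_ensemble]
    if all(f == 1.0 for f in scf) and observed_scf < 1.0:
        # All members fully covered, observation shows less: reduce SWE.
        return [s * (1.0 - reduce_fraction) for s in swe_ensemble]
    return None
```

The rules are needed because an ensemble in which every member predicts the same SCF (0 or 1) has zero variance in observation space, so the ensemble-derived gain vanishes and the EnKF alone cannot respond to the observation.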

Figure 2 shows several observed and modeled snow fields for one snow season. The top 
row shows the coarse-scale (25 km) AMSR-E SWE retrievals, with data missing when the 
satellite swath does not fully cover the study area. MODIS fine-scale estimates of SCF, 
shown in the second row, are available only for clear-sky conditions. The bottom four rows 
of Fig. 2 show that the assimilation of coarse-scale AMSR-E SWE and fine-scale MODIS 
SCF observations both result in realistic fine-scale spatial SWE patterns. 

Through a quantitative validation of the assimilation results with independent measurements
at individual SNOTEL and COOP sites over the course of 8 years, De Lannoy
et al. (2012) demonstrate improvements from the assimilation of SWE and/or SCF retrievals
in shallow snow packs, but not in deep snow packs (not shown). The validation also
shows that joint assimilation of SWE and SCF retrievals yields significantly improved
RMSE and correlation values. For example, the RMSE for SWE versus COOP site measurements
was reduced by 21 % (from 78 to 62 mm) through the joint assimilation of
satellite SWE and SCF retrievals. Furthermore, SCF assimilation was found to improve the
timing of the onset of the snow season, albeit without a net improvement of SWE estimates.
In areas of deep snow, however, AMSR-E retrievals are typically biased low and
require bias correction (or scaling of the observations) prior to data assimilation. De
Lannoy et al. (2012) also showed that the interannual SWE variations could not be
improved through the assimilation of AMSR-E because the AMSR-E retrievals lack
realistic interannual variability in deep snow packs. These deficiencies in the AMSR-E
SWE retrievals motivated the development of the empirical microwave radiative transfer
model (Sect. 3.3.2) toward a radiance-based snow analysis.

Fig. 2 SWE and SCF fields for 6 days (MMDDYYYY) in the winter of 2009-2010 for a 75 km by 100 km
domain (1 km resolution) in northern Colorado. Blue (white) colors indicate low (high) SWE or SCF, black
shading indicates no snow, and orange shading indicates no data. The top two rows show SWE and SCF
satellite observations. The remaining rows show SWE (rows 3 and 4) and SCF (rows 5 and 6) for the
ensemble Open Loop (EnsOL) forecast (no assimilation) and the analyses obtained through data assimilation
(DA) of SWE or SCF. Adapted from De Lannoy et al. (2012)

Of course, horizontal downscaling is not only important for snow assimilation. Low-frequency
passive microwave brightness temperature observations such as those from AMSR-E
and SMOS (and the corresponding soil moisture retrievals) are at the coarse resolution of
~50 km. But for applications such as weather prediction, soil moisture estimates are
needed at hydrometeorological scales of ~10 km or better. Examples of soil moisture
downscaling based on data assimilation are provided by Reichle et al. (2001), Sahoo et al. 
(2012), and Zhou et al. (2006). Also, Reichle and Koster (2003) addressed the propagation 
of observational soil moisture information to unobserved regions. 

3.2 Partitioning of Terrestrial Water Storage Observations 

Passive microwave (e.g., AMSR-E) retrievals have been used in conjunction with land 
surface models to better characterize snow (Sect. 3.1) and soil moisture (Sect. 3.4). 
Gravimetric measurements such as those from GRACE can provide monthly, basin-scale
(>150,000 km²) estimates of changes in TWS (Sect. 2.2). Since TWS is vertically integrated
and includes groundwater, soil moisture, snow, and surface water, TWS retrievals
offer significant insights into the regional- and continental-scale water balance and, 
through data assimilation, the potential to learn more about hydrological processes. 

Besides the obvious spatial downscaling challenge presented by the basin-scale GRACE 
TWS retrievals, another challenge for the assimilation of GRACE-based TWS is the 
partitioning of the vertically integrated TWS retrievals into water cycle component variables.
Like the horizontal downscaling of AMSR-E SWE retrievals discussed in the previous
section, the partitioning of TWS retrievals can be accomplished through assimilation
using an appropriate observation operator. In this case, the observation operator aggregates 
the fine-scale model estimates of soil moisture, groundwater, and snow to basin-scale TWS 
estimates. This observation operator enables the computation of the observation-minus-forecast
residuals (or innovations) in the (basin-scale, TWS) space of the observations. The
observation operator is also needed for the computation of the Kalman gain that transforms 
the innovations back into the space of the fine-scale model variables. Similarly, the 
required temporal aggregation of the model output to the monthly scale of the assimilated 
TWS retrievals is accomplished through the observation operator. 
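
A minimal Python sketch of such an observation operator is given below. It sums the model water storage components in each fine-scale catchment, averages over the basin, and then averages over the days of the month. The function name, the data layout, and the use of catchment area as the spatial weight are assumptions made for illustration and are not taken from the GEOS-5 LDAS.

```python
from statistics import fmean


def tws_observation_operator(daily_states, areas_km2):
    """Map fine-scale daily model states to one basin-scale, monthly TWS value.

    daily_states : list over days; each day is a list over catchments of
                   (soil_moisture_mm, groundwater_mm, swe_mm) tuples
    areas_km2    : catchment areas, used here as spatial weights (assumption)
    """
    total_area = sum(areas_km2)
    daily_basin_tws = []
    for day in daily_states:
        # Vertical integration per catchment, then area-weighted basin mean.
        weighted = sum(area * (soil + ground + swe)
                       for area, (soil, ground, swe) in zip(areas_km2, day))
        daily_basin_tws.append(weighted / total_area)
    # Temporal aggregation to the monthly scale of the retrievals.
    return fmean(daily_basin_tws)


# Hypothetical two-catchment basin over two days (all storages in mm).
areas = [1.0, 3.0]
states = [
    [(100.0, 50.0, 10.0), (200.0, 60.0, 20.0)],
    [(110.0, 50.0, 0.0), (190.0, 60.0, 10.0)],
]
predicted_tws = tws_observation_operator(states, areas)
```

Applying the operator to every ensemble member gives the ensemble of predicted TWS values. The covariances between these predicted values and each fine-scale component then determine how a single basin-scale innovation is distributed among soil moisture, groundwater, and snow.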

This concept was illustrated by Forman et al. (2012), who assimilated GRACE TWS 
retrievals over the Mackenzie River basin located in northwestern Canada (Fig. 3) using an 
updated version of the GEOS-5 LDAS developed by Zaitchik et al. (2008). The assimilation
estimates were evaluated against independent SWE and river discharge observations
(Sect. 2.3). Results suggest improved SWE estimates, including improved timing of the 
subsequent ablation and runoff of the snow pack. For example, Fig. 4 shows the 
improvements in SWE estimates resulting from the assimilation of GRACE TWS retrievals.
The white bars represent model results without assimilation, whereas the gray bars
represent results with assimilation. The labels on the y-axis of each subplot represent sub-basins
of the Mackenzie River basin. As shown in Fig. 4, the assimilation of GRACE TWS
retrievals generally reduced the mean difference and RMSE between the model and the 
independent CMC SWE estimates (Sect. 2.3). The reductions are greatest in the Liard 
basin, where the greatest amount of snow accumulation occurs. Here, the mean difference 
with the CMC estimates is reduced through GRACE data assimilation by 30 % (from 13.2 
to 9.3 mm) and the RMSE is reduced by 18 % (from 24 to 19.6 mm). Smaller reductions 
occur in the other sub-basins. The correlation coefficient of the SWE anomalies (not 
shown) suggests a slight degree of degradation resulting from assimilation, but further 
analysis shows there is no statistically significant difference at the 5 % level. In summary, 
the assimilation of GRACE TWS information into the Catchment land surface model 
reduces the mean difference and RMSE in SWE estimates without adversely impacting 
estimates of interannual variability.

Fig. 3 Map of the 1,800,000 km² Mackenzie River Basin including GEOS-5 topography, sub-basin
delineation, and GRDC observation locations (solid dots). Adapted from Forman et al. (2012)


Additional work was conducted to analyze modeled river discharge estimates against 
ground-based gauging stations. The findings (not shown) suggest that the assimilation of 
GRACE observations causes little or no change in the mean difference and RMSE of 
modeled river discharge, but that small, statistically significant improvements in the 
anomaly correlations were found. Improvements in the modeled river runoff anomalies are 
attributed to a redistribution of the water mass from the snow pack during the accumulation 
phase into the subsurface during the subsequent ablation and runoff phase. This redistribution
of water by the assimilation framework effectively retains water within the
hydrological basin for a longer period of time, which results in small but statistically 
significant improvements in modeled estimates of river discharge. 

Investigation of the analysis increments can provide valuable insights into the behavior 
of the assimilation procedure and track how much and at what time water is being added to 
or removed from the individual TWS components. The thin, solid line in Fig. 5 shows the 
increments made to the subsurface water component. Averaged over the Mackenzie River 
basin and the 7-year experiment period, a total of 12.5 mm of water has been added into 
the subsurface by the assimilation procedure. This is most evident during the spring and 
summer. The thick, dashed line in Fig. 5 shows the increments for SWE. Averaged over 
time and space, SWE is removed during the accumulation phase with a small amount 
added back during the ablation and runoff phase for a total SWE increment of -45.1 mm.
Acting together, the analysis increments to the subsurface water and SWE serve to reduce 
mass during snow accumulation and then increase the mass during ablation and runoff. 
These two phenomena essentially constrain the amplitude of the modeled TWS dynamics 
to achieve better agreement of the model estimates with the GRACE retrievals.

Fig. 4 SWE statistics of (a) mean difference and (b) RMSE for open loop (OL; white) and
assimilation (DA; light gray) of GRACE TWS retrievals relative to CMC SWE estimates via
Sturm et al. (2010). Statistics are for the Mackenzie River Basin (MRB) and its sub-basins
Liard (L), Peace and Athabasca (P + A), Slave (S), and Bear and Peel (B + Pe) shown in
Fig. 3. Adapted from Forman et al. (2012)

Fig. 5 Analysis increments for the entire Mackenzie River basin from GRACE TWS assimilation.
The thin, solid line represents the subsurface water increments, whereas the thick, dashed
line represents the SWE increments. Adapted from Forman et al. (2012)



The results shown in Figs. 4 and 5 imply that the assimilation procedure can effectively 
partition the vertically integrated GRACE TWS retrievals into their snow and subsurface 
water components. Houborg et al. (2012), Li et al. (2012), Su et al. (2010), and Zaitchik 
et al. (2008) further investigated the horizontal, vertical, and temporal disaggregation of 
GRACE TWS retrievals and reached similar conclusions for other basins in North America 
and Europe in different climate zones. Collectively, the growing body of research suggests 
that GRACE TWS assimilation can lead to better understanding of the hydrological cycle 
in remote regions of the globe where ground-based observation collection is difficult, if not 
impossible. This information could ultimately lead to improved freshwater resource 
management as well as reduced uncertainty in river discharge. 

3.3 Microwave Radiative Transfer Models for Radiance Data Assimilation 

It is well established for atmospheric data assimilation systems that the assimilation of 
satellite radiance observations is preferable to the assimilation of geophysical retrievals 
(Eyre et al. 1993; Joiner and Dee 2000). The former approach incorporates the radiative 
transfer model into the assimilation system and thereby avoids inconsistencies in the use of 
ancillary data between the assimilation system and the (pre-processed) geophysical retri- 
evals. For land data assimilation, however, the vast majority of publications assimilate 
geophysical retrievals (Lahoz and De Lannoy 2013). In this section, we discuss the 
development of forward radiative transfer models (RTMs) that convert land surface model 
variables into microwave brightness temperatures. The first example presents such a model 
for warm-season microwave brightness temperatures (Sect. 3.3.1). The second example 
introduces a neural network approach to predict microwave brightness temperatures over 
snow-covered land (Sect. 3.3.2). 

3.3.1 Warm-Season, L-Band Radiative Transfer Modeling 

Global observations of brightness temperatures (Tb) at L-band (1.4 GHz) are available 
from the SMOS mission, and similar Tb observations are expected from the planned Soil 
Moisture Active Passive (SMAP; Entekhabi et al. 2010) mission. In preparation for the 
assimilation of Tb observations from SMOS and SMAP, De Lannoy et al. (2013) added a 
physically based, warm-season microwave RTM to the GEOS-5 Catchment model. The 
RTM is based on the commonly used, zero-order “tau-omega” approach that accounts for 
microwave emission by the soil and the vegetation canopy as well as attenuation by the 
vegetation. While the RTM is based on sound physical principles, determining the required 
parameter values for the microwave roughness, scattering albedo, and vegetation optical 
depth on a global scale is a serious challenge. 
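
To make the role of the three parameter groups concrete, the following is a minimal sketch of a zero-order tau-omega forward model. It is not the GEOS-5 implementation: the function names, the simple exponential roughness correction, and the use of a prescribed complex soil permittivity `eps` (rather than a soil dielectric mixing model driven by soil moisture) are assumptions made for illustration only.

```python
import numpy as np


def fresnel_reflectivity(eps, theta_deg, pol):
    """Smooth-surface Fresnel power reflectivity for complex permittivity eps."""
    theta = np.deg2rad(theta_deg)
    cos_t = np.cos(theta)
    root = np.sqrt(eps - np.sin(theta) ** 2)
    if pol == "H":
        r = (cos_t - root) / (cos_t + root)
    elif pol == "V":
        r = (eps * cos_t - root) / (eps * cos_t + root)
    else:
        raise ValueError("pol must be 'H' or 'V'")
    return np.abs(r) ** 2


def tau_omega_tb(t_soil, t_canopy, eps, theta_deg, h, tau, omega, pol="H"):
    """Zero-order tau-omega brightness temperature in Kelvin (illustrative sketch)."""
    theta = np.deg2rad(theta_deg)
    r_smooth = fresnel_reflectivity(eps, theta_deg, pol)
    # Assumed roughness correction: reflectivity reduced by exp(-h)
    r_rough = r_smooth * np.exp(-h)
    # One-way transmissivity of the canopy along the slant path
    gamma = np.exp(-tau / np.cos(theta))
    # Soil emission attenuated by the canopy
    soil = t_soil * (1.0 - r_rough) * gamma
    # Canopy emission, upward plus downward-then-reflected by the soil
    canopy = t_canopy * (1.0 - omega) * (1.0 - gamma) * (1.0 + r_rough * gamma)
    return soil + canopy
```

In this sketch, a larger h raises the simulated Tb by lowering the surface reflectivity, a larger vegetation opacity shifts the signal from soil emission toward canopy emission, and a larger scattering albedo lowers the canopy contribution.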

De Lannoy et al. (2013) collected three different sets of literature values for the 
L-band RTM parameters. “Lit1” refers to parameters that are proposed for the future 
SMAP radiometer retrieval product, “Lit2” are parameters collected from literature 
studies using the L-band Microwave Emission of the Biosphere model (Wigneron et al. 
2007) and related models, and “Lit3” is the same as Lit2 except that the microwave 
roughness parameter is set to values used for SMOS monitoring at the European Centre for 
Medium-Range Weather Forecasts (ECMWF). The three sets of parameters are illustrated 
in Fig. 6, which shows the resulting microwave roughness (h), vegetation opacity (τ), and 
scattering albedo (ω) by vegetation class. As can be seen from the figure, there are large 
differences in h, τ, and ω between the three sets of literature values. These differences 
translate into climatological differences in the simulated brightness temperatures. 

For example, Fig. 7a-c shows the differences between 1-year mean (July 1, 2010-July 
1, 2011) model simulations (using the three different literature-based sets of RTM 
parameters) and SMOS observations for H-polarized Tb at 42.5° incidence angle. Modeled 
brightness temperatures are at 36 km resolution, commensurate with the resolution of the 
SMOS observations. Brightness temperatures are screened for frozen soil conditions, snow 
on the ground, heavy precipitation, proximity to open water surfaces, and radio-frequency 

Fig. 6 a Time-mean ⟨h⟩ (July 1, 2010-July 1, 2011), b time-mean ⟨τ⟩, and c time-invariant ω; 
(Lit1, Lit2, and Lit3) before calibration, and (Cal) after calibration, spatially averaged by 
vegetation class. International Geosphere-Biosphere Program (IGBP) vegetation classes are 
(ENF) Evergreen Needleleaf Forest, (EBF) Evergreen Broadleaf Forest, (DNF) Deciduous Needleleaf 
Forest, (DBF) Deciduous Broadleaf Forest, (MXF) Mixed Forest, (CSH) Closed Shrublands, (OSH) 
Open Shrublands, (WSV) Woody Savannas, (SAV) Savannas, (GRS) Grasslands, (CRP) Croplands, 
(CRN) Cropland and Natural Vegetation, and (BAR) Barren or Sparsely Vegetated. Thin gray lines 
for Cal indicate the spatial standard deviation within each vegetation class. Adapted from 
De Lannoy et al. (2013) 

interference. The figure shows that all three sets of literature values for the RTM 
parameters lead to substantial biases against SMOS observations, with Lit1 being too cold 
(by 42.0 K on average) and Lit3 too warm (by 24.6 K on average). Even though Lit2 
estimates are nearly unbiased in the global average, there are still significant regional 
biases in the simulated Tbs, with an average absolute bias of 12.7 K. Since such biases 
would interfere with the assimilation of satellite Tb, the RTM parameters need to be 
calibrated to achieve climatologically unbiased Tb simulations. 

The most important RTM parameters determining h, τ, and ω have been calibrated, 
separately for each model grid cell, using multi-angular SMOS observations from July 1, 
2011 to July 1, 2012. The calibration simultaneously minimizes, separately for each 
location, the difference between the modeled and observed climatological mean values, the 
difference between modeled and observed climatological standard deviations, and the 
deviations of the optimized parameters from prior guesses (that is, from Lit1, Lit2, or Lit3 
values). Through investigating a number of calibration scenarios, De Lannoy et al. (2013) 
determined that it is best to simultaneously calibrate a subset of the RTM parameters that 
most directly determine h, τ, and ω. 
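
The three penalty terms described above can be sketched as a single cost function evaluated at one grid cell. The quadratic form, the weights `w_mean`, `w_std`, and `w_prior`, and the prior standard deviations `prior_std` are assumptions introduced here for illustration; the text only states which quantities are minimized, not how they are weighted.

```python
import numpy as np


def calibration_cost(tb_sim, tb_obs, params, prior, prior_std,
                     w_mean=1.0, w_std=1.0, w_prior=1.0):
    """Illustrative per-grid-cell cost: mean bias, std-dev mismatch, prior departure."""
    tb_sim = np.asarray(tb_sim, dtype=float)
    tb_obs = np.asarray(tb_obs, dtype=float)
    # Squared difference of climatological means (K^2)
    mean_term = (tb_sim.mean() - tb_obs.mean()) ** 2
    # Squared difference of climatological standard deviations (K^2)
    std_term = (tb_sim.std() - tb_obs.std()) ** 2
    # Squared, normalized departure of the parameters from their prior guesses
    z = (np.asarray(params, dtype=float) - np.asarray(prior, dtype=float)) \
        / np.asarray(prior_std, dtype=float)
    prior_term = np.sum(z ** 2)
    return w_mean * mean_term + w_std * std_term + w_prior * prior_term
```

A calibration would evaluate this cost with `tb_sim` produced by the RTM for candidate parameters and pass it to an optimizer of choice, separately for each location.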

After calibration, global Tb simulations for the validation year (July 1, 2010-July 1, 
2011) are largely unbiased for multiple incidence angles and both H- and V-polarization. 
For example, Fig. 7d shows that the global average absolute bias is now just 2.7 K for 
H-polarized Tb at 42.5° incidence angle. It should be emphasized that an RMSE of 
approximately 10 K remains, which is partly due to seasonal biases and partly due to 
random errors. The former will be addressed in the assimilation system through bias 
estimation and correction, and the latter through the radiance-based soil moisture analysis. 

Fig. 7 Difference between 1-year (July 1, 2010-July 1, 2011) mean values of Tb_H (42.5°) in Kelvin from 
GEOS-5 and SMOS observations for a Lit1, b Lit2, c Lit3, and d calibrated parameters. Within each 
subplot, “avg(|.|)” indicates the average absolute difference across the globe (excluding regions impacted by 
open water or radio-frequency interference that are shown in white). Adapted from De Lannoy et al. (2013) 

The calibrated parameters are shown in Fig. 6. Results suggest, for example, that the 
roughness parameter (h) is too low in Lit1 and too high in Lit3. The calibrated vegetation 
opacity (τ) values distinguish clearly between high and low vegetation. The calibrated 
scattering albedo (ω) is increased over low vegetation, which reduces the vegetation effect 
in the simulated Tb. In summary, the climatological calibration generates plausible 
parameter values that are consistent with the underlying land modeling system. 

3.3.2 Predicting Microwave Brightness Temperatures over Snow 

As demonstrated in the previous section, the Catchment model (like similar global land 
surface models) supports the application of a physically based microwave RTM for warm- 
season processes. However, the snow model components in global land surface models, 
including that in the Catchment model, are usually too simplistic to support physically 
based RTM modeling in the presence of snow. Specifically, global snow models lack 
reliable estimates of snow microphysical properties (such as grain size, ice layers, and 
depth hoar) which would be needed for physically based forward modeling of the 
microwave brightness temperatures. Forman et al. (2013) therefore constructed an 
empirical forward RTM for snow-covered land surfaces based on an Artificial Neural 
Network (ANN). 

The Catchment model state variables used as input to the ANN include the density and 
temperature of the snowpack at multiple depths, the temperature of the underlying soil, the 
overlying air, and the vegetative canopy, and the total amount of water equivalent within 
the snowpack. In addition, a cumulative temperature gradient index (TGI) is used as a 
proxy for snow grain size evolution in the presence of a vapor pressure gradient. Using the 
above inputs, the ANN is trained and (independently) validated using 10.7, 18.7, and 
36.5 GHz microwave brightness temperatures at H- and V-polarization from AMSR-E. 
The independent validation is accomplished as follows: From the 9-year AMSR-E data 
record, each single year is withheld in turn from the ANN training, and skill metrics for the 
resulting ANN predictions are computed only against the AMSR-E data that have been 
withheld from the ANN training. 
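
The withholding scheme described above is a leave-one-year-out cross-validation and can be sketched as follows. The callables `fit` and `score` are hypothetical placeholders for the ANN training and the skill computation, and labeling the nine years by their starting calendar year is an assumption made for the usage example.

```python
def leave_one_year_out(years):
    """Yield (training_years, withheld_year) pairs, withholding each year in turn."""
    years = list(years)
    for withheld in years:
        training = [y for y in years if y != withheld]
        yield training, withheld


def cross_validate(years, fit, score):
    """Train on all years but one, score only on the withheld year, repeat for all."""
    results = {}
    for training, withheld in leave_one_year_out(years):
        model = fit(training)                     # hypothetical training routine
        results[withheld] = score(model, withheld)  # hypothetical skill metric
    return results


# Example: nine annual periods labeled 2002 through 2010 (labels are illustrative)
splits = list(leave_one_year_out(range(2002, 2011)))
```

Because each skill metric is computed only from the withheld year, no observation used for validation has influenced the corresponding trained network.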

Figure 8 demonstrates the performance of the ANN predictions relative to AMSR-E 
measurements that were not used during training. The figure illustrates the overall ability 
of the ANN to predict Tbs for the 10 GHz V-polarized channel. The ANN predictions are 
essentially unbiased (relative to the AMSR-E measurements) across the 9-year period 
(Fig. 8a). The RMSE is typically less than 5 K (Fig. 8b). In addition, the ANN demon- 
strates skill in predicting interannual variability, with anomaly R values well above 0.5 
over large parts of North America (Fig. 8c). Relatively low skill can be seen in areas along 
the southern periphery, where the snowpack is relatively thin and ephemeral, as well as in 
areas north of the boreal forest, where sub-grid scale lake ice (which is not modeled in the 
land surface model) is common. In short, Fig. 8 suggests considerable skill by the ANN at 
predicting interannual variability in 10 GHz V-polarized Tbs across North America with 
negligible bias and a reasonable RMSE. The RMSE is somewhat higher but still reasonable 
(less than 10 K) for the higher frequencies and for H-polarization Tb (see Figures 4-6 of 
Forman et al. 2013). 

Forman et al. (2013) also assessed the potential for using the ANN as a forward 
observation operator in radiance-based snow assimilation. For this demonstration, the 
observations are considered to be in the form of spectral differences in V-polarization 
brightness temperatures, ΔTb = Tb_V(18 GHz) - Tb_V(36 GHz). Since ΔTb typically 
increases with increasing SWE, this spectral difference is commonly used to estimate SWE 

Fig. 8 a Bias, b RMSE, and c anomaly R for ANN simulated 10 GHz V-polarized Tb from September 1, 
2002 to September 1, 2011 versus AMSR-E observations not used in training. Anomaly R values not 
statistically different from zero at the 95 % significance level based on a Fisher Z transform are shown in 
gray. Such non-significant R values occur in only a few very small regions 

in retrieval algorithms (Kelly 2009). For the demonstration of the radiance-based 
assimilation considered here, observations of ΔTb imply that the resulting Kalman gain is 
proportional to error correlations between modeled SWE and ANN predictions of ΔTb. To 
obtain analysis increments, the Kalman gain would be multiplied with innovations in ΔTb 
(that is, the difference between actual AMSR-E observations of ΔTb and ANN predictions 
of ΔTb). 

The Kalman gain computed for February 6, 2003 ranges from -10 to 15 mm K⁻¹ as 
illustrated in Fig. 9. A gain of 1 mm K⁻¹ equates to an increase of 1 mm in the posterior 
(updated) modeled SWE for a 1 K innovation (that is, for a difference of 1 K between 
AMSR-E ΔTb measurements and ANN ΔTb predictions). Similarly, a negative Kalman 
gain in the presence of a positive-valued innovation would equate to a reduction in 
modeled SWE. Most importantly, the results suggest that there is a nonzero error 
correlation between the model SWE forecasts and the simulated ΔTb measurements across 
much of the North American domain. Overall, the results suggest that the ANN could serve 
as a computationally efficient observation operator for radiance-based snow data 
assimilation at the continental scale. 
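
The relationship between the gain, the innovation, and the SWE increment can be illustrated for a single location with a scalar ensemble-based gain. This is a sketch under simplifying assumptions (one state variable, one observation, and a prescribed observation error variance `obs_err_var`); the function names are hypothetical and do not correspond to the GEOS-5 LDAS code.

```python
import numpy as np


def ensemble_gain(swe_ens, dtb_ens, obs_err_var):
    """Scalar Kalman gain (mm per K) from ensemble SWE and predicted spectral difference."""
    swe_ens = np.asarray(swe_ens, dtype=float)
    dtb_ens = np.asarray(dtb_ens, dtype=float)
    # Sample error covariance between modeled SWE and predicted spectral difference
    cov_xy = np.cov(swe_ens, dtb_ens, ddof=1)[0, 1]
    # Sample error variance of the predicted spectral difference
    var_y = np.var(dtb_ens, ddof=1)
    return cov_xy / (var_y + obs_err_var)


def swe_increment(gain, dtb_obs, dtb_pred):
    """SWE analysis increment (mm): gain times innovation (observed minus predicted)."""
    return gain * (dtb_obs - dtb_pred)
```

The sign of the gain follows the sign of the sampled error covariance, which is why a positive innovation can either add or remove SWE depending on the location.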

3.4 Observation Selection for a Root Zone Soil Moisture Analysis 

Knowledge of the amount of moisture stored in the root zone of the soil is important for 
many applications related to the transfer of water, energy, and carbon between the land and 
the atmosphere, including the assessment, monitoring, and prediction of drought (Sene- 
viratne et al. 2010). At the global scale, soil moisture estimates are usually based on two 
sources of information: (1) direct observations of surface soil moisture from satellite and 
(2) observation-based precipitation forcing driving a numerical model of soil moisture 
dynamics. However, neither surface soil moisture retrievals nor precipitation observations 
provide direct measurements of soil moisture in the root zone. The selection of the most 
relevant types of observations for a root zone soil moisture analysis therefore presents an 
important conceptual problem. 

A priori, it is not obvious whether the estimation of root zone soil moisture would 
benefit more from the use of precipitation observations (as, for example, in the Global 
Land Data Assimilation System; Rodell et al. 2003) or from the assimilation of surface soil 
moisture retrievals (as, for example, illustrated by Reichle et al. 2007). This section pro- 
vides examples of both approaches. First, a land surface reanalysis that relies on observed 
precipitation is presented, followed by a root zone soil moisture analysis that is based on 
the assimilation of surface soil moisture retrievals. Finally, the two sources of soil moisture 


Fig. 9 Histogram of the Kalman gain on February 6, 2003 for SWE versus 
ΔTb = [Tb_V(18 GHz) - Tb_V(36 GHz)] 

information are merged and compared directly in a single system, and their relative 
contributions to the skill of root zone soil moisture estimates are assessed. 

3.4.1 Using Precipitation Observations 

The Modern-Era Retrospective Analysis for Research and Applications (MERRA) is a 
state-of-the-art atmospheric reanalysis data product based on GEOS-5 that provides, in 
addition to atmospheric fields, global estimates of soil moisture, latent heat flux, snow, and 
runoff for 1979-present with a latency of about 1 month (Rienecker et al. 2011). A 
supplemental and improved set of land surface hydrological fields (“MERRA-Land”) is 
generated routinely using an improved version of the land component of the MERRA 
system (Reichle et al. 2011; Reichle 2012). Specifically, the MERRA-Land estimates 
benefit from corrections to the MERRA precipitation forcing with the global gauge-based 
NOAA Climate Prediction Center “Unified” (CPCU) precipitation product and from 
revised parameter values in the rainfall interception model, changes that effectively correct 
for known limitations in the MERRA surface meteorological forcings. 

With a few exceptions, the MERRA-Land data appear more accurate than the original 
MERRA estimates and are thus recommended for those interested in using MERRA output 
for land surface hydrological studies. As an example, Fig. 10 examines the drought con- 
ditions experienced across the western United States and along the East Coast. The 
MERRA and MERRA-Land drought indicator shown in the figure is derived by ranking, 
separately for each grid cell, the normalized, monthly mean root zone soil moisture 
anomalies for June, July, and August of 1980 through 2011 and converting the rank into 
percentile units. For comparison, the drought severity assessed independently by the U.S. 
Drought Monitor is also shown. The figure clearly demonstrates that MERRA-Land data 
are more consistent with the Drought Monitor than MERRA data. 
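
The rank-to-percentile conversion behind the drought indicator can be sketched for a single grid cell as follows. The specific convention used here (rank counted from the driest value, percentile equal to rank divided by sample size) is an assumption; the text specifies only that ranks are converted into percentile units.

```python
import numpy as np


def drought_percentile(anomalies, target_index):
    """Percentile rank of one monthly anomaly within the grid cell's own sample."""
    anomalies = np.asarray(anomalies, dtype=float)
    # Rank 1 is the driest value; ties share the highest rank among them
    rank = np.sum(anomalies <= anomalies[target_index])
    return 100.0 * rank / anomalies.size
```

For the indicator described above, `anomalies` would hold the normalized June, July, and August root zone soil moisture anomalies of all years for one grid cell, and low percentiles would indicate dry conditions relative to that cell's own record.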

Reichle et al. (2011) and Reichle (2012) provide a more comprehensive and quantitative 
analysis of the skill (defined as the correlation coefficient of the anomaly time series with 
independent observations) in land surface hydrological fields from MERRA, MERRA-Land, 
and the latest global atmospheric reanalysis produced by ECMWF (ERA-I; Dee 
et al. 2011). Figure 11 shows that MERRA-Land and ERA-I root zone soil moisture skills 
(against in situ observations at 85 US stations) are comparable and significantly greater 
than that of MERRA. Furthermore, the runoff skill (against naturalized stream flow 
observations from 18 US basins) of MERRA-Land is typically higher than that of MERRA 
and ERA-I (not shown). Throughout the northern hemisphere, MERRA and MERRA-Land 
agree reasonably well with in situ snow depth measurements (from 583 stations) and with 
SWE from an independent analysis (not shown). In summary, through observations-based 
corrections of the MERRA precipitation forcing, MERRA-Land provides a supplemental 
and significantly improved land surface reanalysis product. 

3.4.2 Assimilating Surface Soil Moisture Retrievals 

Satellite retrievals of surface soil moisture are not used in MERRA-Land but would almost 
certainly have further improved the skill of root zone soil moisture estimates. Draper et al. 
(2012) illustrate the potential gains from assimilating ASCAT (Bartalis et al. 2007; Wagner 
et al. 1999) and 10.7 GHz AMSR-E Land Parameter Retrieval Model (LPRM; de Jeu et al. 
2008) surface soil moisture retrievals. The retrievals are assimilated, both separately and 
jointly, over 3.5 years into the GEOS-5 LDAS, using MERRA forcing and initial condi- 
tions. Soil moisture skill is measured as the anomaly time series correlation coefficient 

Fig. 10 Drought indicator derived from (top left) MERRA and (bottom left) MERRA-Land root zone soil 
moisture estimates for August 2002. Darker colors indicate more severe drought conditions. MERRA-Land 
estimates are more consistent than MERRA estimates with an independent drought assessment from the US 
Drought Monitor for August 13, 2002 (right). Color scale: drought indicator [percentile] with levels 2, 5, 
10, and 20. Drought Monitor severity classes: D0 Abnormally Dry, D1 Drought - Moderate, D2 Drought - 
Severe, D3 Drought - Extreme, D4 Drought - Exceptional (http://droughtmonitor.unl.edu) 


Fig. 11 Skill (pentad anomaly R; dimensionless) of MERRA, MERRA-Land, and ERA-I estimates 
(2002-2009) versus SCAN in situ surface and root zone soil moisture measurements at 85 stations. 
Error bars indicate approximate 95 % confidence intervals. Adapted from Reichle (2012) 

Fig. 12 Mean skill for root zone soil moisture from the open loop (ensemble mean, no assimilation), and 
the data assimilation (DA) of ASCAT, AMSR-E, and both surface soil moisture retrievals, averaged by land 
cover class, with 95 % confidence intervals. The number of sites in each land cover class is given in the axis 
labels. Skill is defined as the daily anomaly R value versus SCAN/SNOTEL and Murrumbidgee in situ 
observations. Adapted from Draper et al. (2012) 

(R) with in situ soil moisture observations from the SCAN/SNOTEL network in the US (66 
sites) and the Murrumbidgee Soil Moisture Monitoring Network in Australia (19 sites). 
These 85 sites are surrounded by terrain with low topographic complexity based on data 
provided with the ASCAT observations. Averaged over these sites, the ASCAT and 
AMSR-E surface soil moisture retrievals have similar skill (Draper et al. 2012). 
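
The anomaly correlation skill metric can be sketched as follows. How the anomalies are constructed is not specified in this section, so the sketch assumes that each series' own mean seasonal cycle is removed using a hypothetical `period_index` (for example, the calendar month of each sample) before the correlation is computed.

```python
import numpy as np


def anomaly_r(model, obs, period_index):
    """Correlation of anomalies after removing each series' mean seasonal cycle."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    period_index = np.asarray(period_index)
    model_anom = np.empty_like(model)
    obs_anom = np.empty_like(obs)
    for p in np.unique(period_index):
        sel = period_index == p
        # Subtract the climatological mean of this period (assumed anomaly definition)
        model_anom[sel] = model[sel] - model[sel].mean()
        obs_anom[sel] = obs[sel] - obs[sel].mean()
    return np.corrcoef(model_anom, obs_anom)[0, 1]
```

Removing the seasonal cycle separately from the estimates and the in situ observations means that the metric rewards agreement in the timing of wet and dry departures rather than agreement in the mean seasonal cycle or in absolute values.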

Figure 12 shows the estimated R values and their 95 % confidence intervals for root 
zone soil moisture from the assimilation of ASCAT, AMSR-E, and both. The results are 
benchmarked against an open loop (no assimilation) model integration and have been 
averaged by land cover type (based on MODIS land cover classifications). Averaged across 
all 85 sites, assimilating ASCAT and/or AMSR-E surface soil moisture retrievals signif- 
icantly improved the root zone soil moisture skill (at the 5 % level). The mean skill was 
increased from 0.45 for the open loop, to 0.55 for the assimilation of ASCAT, 0.54 for the 
assimilation of AMSR-E, and 0.56 for the assimilation of both. 

Assimilating the ASCAT or AMSR-E retrievals also improved the mean R value over 
each individual land cover type, in most cases significantly. At the frequencies observed by 
AMSR-E and ASCAT, dense vegetation limits the accuracy of soil moisture observations, 
and so the improvements obtained over the mixed cover sites, which have 10-60 % trees or 
wooded vegetation, are very encouraging. For each land cover type, the skill obtained from 
the assimilation of ASCAT or AMSR-E retrievals was very similar. The combined 
assimilation of ASCAT and AMSR-E retrievals generally matched or slightly exceeded the 
mean R values from the single-sensor assimilation experiments. 

Draper et al. (2012) also examined the contribution of the model skill and the obser- 
vation skill to the skill of the assimilation estimates. The color surface in Fig. 13 shows the 
skill improvements (ΔR) in root zone soil moisture, where ΔR is defined as the skill (R) of 
the assimilation estimates (from the single-sensor assimilation of ASCAT or AMSR-E 
retrievals) minus that of the open loop model estimates. The skill improvements are shown 
as a function of the open loop model skill and the retrieval skill. Specifically, the ordinate 
measures the skill of the open loop root zone soil moisture estimates, and the abscissa 
measures the skill of the assimilated (ASCAT or AMSR-E) surface soil moisture retrievals. 
Where the skill of the assimilated retrievals is no more than 0.2 less than the open loop skill 
(below the dashed line), the assimilation improves the root zone soil moisture skill. The 
improvements increase (up to 0.4) as the observation skill increases relative to that of the 




Surv Geophys (2014) 35:577-606 




Fig. 13 Root zone soil moisture skill improvement (ΔR) from assimilating either ASCAT or AMSR-E 
surface soil moisture retrievals as a function of (ordinate) the open loop model skill and (abscissa) the 
observation skill. Skill improvement (ΔR) is defined as the skill of the assimilation product minus the open 
loop skill, with skill based only on days with data available from both satellites. Skill is assessed versus 
in situ measurements from the SCAN and Murrumbidgee networks. Significant improvements are found in 
the area below the dashed line where the skill of the retrievals may be lower than that of the open loop by up 
to 0.2. Adapted from Draper et al. (2012) 

open loop (toward the bottom right hand corner). (The results are very similar if the 
ordinate measures surface soil moisture skill; not shown.) Figure 13 thus provides a 
practical demonstration of the minimum skill required for soil moisture observations to be 
beneficial in a land data assimilation system and confirms the findings obtained by Reichle 
et al. (2008b) using synthetically generated observations. In summary, the assimilation of 
active or passive microwave data significantly improves the model root zone soil moisture 
estimates by a similar amount, even in cases where the assimilated surface soil moisture 
retrievals are less skillful than the open loop soil moisture estimates. 

3.4.3 Combining Precipitation Observations and Surface Soil Moisture Retrievals 

Liu et al. (2011a) used both precipitation observations and surface soil moisture retrievals 
within the GEOS-5 LDAS and investigated their relative contributions to the skill of root 
zone soil moisture estimates. Relative to baseline soil moisture estimates from MERRA, 
their study investigates soil moisture skill derived from (1) land model forcing corrections 
based on large-scale, gauge- and satellite-based precipitation observations and (2) 
assimilation of surface soil moisture retrievals from AMSR-E. Three precipitation products 
were used (separately) to correct the MERRA precipitation toward gauge- and satellite- 
based observations: the NOAA Climate Prediction Center Merged Analysis of Precipita- 
tion (CMAP) pentad product (“standard” version), the Global Precipitation Climatology 
Project (GPCP) version 2.1 pentad product, and the NOAA Climate Prediction Center 
(CPC) daily unified precipitation analysis over the United States. 

Two different surface soil moisture retrieval products were assimilated into the GEOS-5 
LDAS: (1) the operational NASA Level-2B AMSR-E “AE-Land” product (version V09) 
archived at the National Snow and Ice Data Center (NSIDC; Njoku et al. 2003) and (2) the 
AMSR-E LPRM product (de Jeu et al. 2008). Soil moisture skill is assessed using in situ 
observations in the continental United States at the 37 single-profile sites within the SCAN 
network for which skillful AMSR-E retrievals are available. As in Sect. 3.4.2, skill is 
assessed in terms of the anomaly time series correlation coefficient R. 

Figure 14 shows comparable average skill for surface soil moisture estimates from the 
two AMSR-E products and from the Catchment model with MERRA precipitation forcing 
without data assimilation. Consistent with the findings of Sect. 3.4.1, adding information 
from precipitation observations increases soil moisture skills for surface and root zone soil 
moisture. Consistent with the results of Sect. 3.4.2, assimilating satellite estimates of 
surface soil moisture also increases soil moisture skills, again for surface and root zone soil 
moisture. The salient result is that adding information from both sources (precipitation 
observations and surface soil moisture retrievals) increases soil moisture skills by almost 
the sum of the individual skill contributions, which demonstrates that precipitation cor- 
rections and assimilation of satellite soil moisture retrievals contribute important and 
largely independent amounts of information. 

Liu et al. (2011a) also repeated their skill analysis against measurements from four 
USDA Agricultural Research Service (“CalVal”) watersheds with high-quality distributed 
sensor networks that measure surface soil moisture at the scale of land model and satellite 

Fig. 14 Skill (daily anomaly R; dimensionless) versus SCAN in situ soil moisture measurements for 
estimates from two AMSR-E retrieval datasets (NSIDC and LPRM), the Catchment model forced with four 
different precipitation datasets (MERRA, CMAP, GPCP, and CPC), and the corresponding data assimilation 
integrations (red bars: DA/NSIDC and blue bars: DA/LPRM). Average is based on 37 SCAN sites for 
surface and 35 SCAN sites for root zone soil moisture. Error bars indicate approximate 95 % confidence 
intervals. Adapted from Liu et al. (2011a) 

estimates (Jackson et al. 2010). As expected, the skill of the satellite, model, and 
assimilation estimates is higher when it is assessed against the multi-sensor CalVal observations 
rather than against single-profile SCAN measurements (not shown). The relative skill 
contributions by precipitation corrections and soil moisture retrieval assimilation, however, 
remain unchanged (not shown). This corroborates the results shown in Fig. 14 which were 
obtained with a larger network of single-profile sensors. 

Taken together, the results of this section strongly suggest that future land surface 
reanalysis efforts would benefit from the use of both precipitation observations and satellite 
retrievals of surface soil moisture because both types of observations contribute significant 
and largely independent amounts of information to the skill of root zone soil moisture in 
the analysis. Moreover, both active and passive surface soil moisture retrievals should be 
assimilated for maximum coverage and accuracy. 


4 Summary and Discussion 

The present study discussed several conceptual challenges in land surface hydrological 
data assimilation as part of an effort toward improving our understanding of the Earth’s 
hydrological cycle (Trenberth and Asrar 2013). The challenges arise from a seeming 
mismatch between the assimilated observations and the water cycle variables of interest 
that can be overcome through the careful design of the assimilation system. This was 
illustrated with examples from recent research findings using the GEOS-5 LDAS. 

The first challenge is the use of coarse-scale satellite observations to estimate land 
surface fields at finer scales of interest. Such horizontal downscaling can be accomplished 
by using a fine-scale land surface model and by defining an observation operator that maps 
from the fine-scale model space to the space of the coarse-scale observations (Sect. 3.1). In 
the presence of larger-scale model error correlations, the assimilation system can also 
spread observational information to unobserved locations. 

The second challenge is the partitioning of satellite observations (such as TWS retri- 
evals) into their component variables. This partitioning can again be accomplished through 
an observation operator. In the case of TWS assimilation, the observation operator maps 
from the fine-scale model estimates of soil moisture and snow to basin-scale TWS (Sect. 
3.2). The observation operator therefore enables the computation of the observation-minus- 
forecast residuals (innovations). The observation operator is also needed for the compu- 
tation of the Kalman gain matrix that transforms the observation-space (coarse-scale TWS) 
innovations into the model-space (fine-scale soil moisture and snow) analysis increments. 
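
The dual role of the observation operator described above, computing innovations and shaping the gain that partitions them, can be sketched with a small ensemble update. This is an illustrative simplification rather than the GEOS-5 LDAS algorithm: the state holds only two components per fine-scale cell, the operator is a plain area-weighted sum, all members share the same unperturbed observation, and every name is hypothetical.

```python
import numpy as np


def enkf_tws_update(state_ens, area_weights, tws_obs, obs_err_var):
    """
    Ensemble update of fine-scale components from one basin-scale TWS observation.

    state_ens    : (n_ens, n_cells, 2) array of [subsurface water, SWE] in mm
    area_weights : (n_cells,) fractional cell areas summing to 1
    Returns analysis increments with the same shape as state_ens.
    """
    state_ens = np.asarray(state_ens, dtype=float)
    n_ens = state_ens.shape[0]
    x = state_ens.reshape(n_ens, -1)            # flatten cells and components
    # Observation operator: area-weighted sum of both components gives basin TWS
    h = np.repeat(np.asarray(area_weights, dtype=float), 2)
    y = x @ h                                   # predicted TWS for each member
    x_pert = x - x.mean(axis=0)
    y_pert = y - y.mean()
    # Error covariance between every fine-scale component and basin-scale TWS
    cov_xy = x_pert.T @ y_pert / (n_ens - 1)
    var_y = y_pert @ y_pert / (n_ens - 1)
    gain = cov_xy / (var_y + obs_err_var)       # one gain entry per state element
    innovation = tws_obs - y                    # observation minus forecast
    increments = np.outer(innovation, gain)     # model-space analysis increments
    return increments.reshape(state_ens.shape)
```

In this sketch the partitioning of a basin-scale innovation among cells and between subsurface water and SWE is controlled entirely by the ensemble error covariances, so components with larger forecast uncertainty that co-vary with basin TWS receive larger increments.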

The third challenge is the development of microwave RTMs for use as observation 
operators in radiance-based data assimilation. Two examples were given. In the first 
example, a global microwave RTM for warm-season, L-band brightness temperatures was 
calibrated successfully using SMOS observations (Sect. 3.3.1). In the second example, an 
empirical approach based on an artificial neural network yielded robust model simulations 
of AMSR-E microwave brightness temperatures over snow-covered land at continental 
scales (Sect. 3.3.2). In both cases, the results are very encouraging and constitute progress 
toward replacing the commonly used assimilation of geophysical retrievals (such as SWE 
or surface soil moisture retrievals) with the direct assimilation of satellite radiances. Note 
that a radiance-based soil moisture analysis can partition the observational (brightness 
temperature) information into increments of model soil moisture, soil temperature, and 
vegetation water content (essentially, the model variables that most impact the brightness 
temperature). In other words, the microwave RTM, acting as the observation operator, 
takes on a role that is conceptually similar to that of the observation operator used for the 
partitioning of TWS information into its water cycle components (Sect. 3.2). 

The fourth and final challenge addressed in the paper is the selection of the types 
of observations that are most relevant for the analysis of poorly observed variables. For the 
analysis of one such variable, root zone soil moisture, the use of gauge- and satellite-based 
precipitation observations along with active and passive surface soil moisture retrievals 
was investigated (Sect. 3.4). It was shown that the MERRA-Land surface reanalysis 
provides better estimates of root zone soil moisture than MERRA due to the use of gauge- 
based precipitation observations in MERRA-Land. Next, the potential skill gained from the 
assimilation of surface soil moisture retrievals was investigated. It was demonstrated that 
improved root zone soil moisture estimates can be obtained even where the skill of the 
assimilated surface soil moisture retrievals is somewhat poorer than that of the model 
estimates of surface soil moisture. For maximum coverage and accuracy, both active and 
passive retrievals should be assimilated. Finally, it was shown that the use of precipitation 
observations and the assimilation of surface soil moisture retrievals contribute significant 
and largely independent amounts of root zone soil moisture information. Therefore, future 
reanalyses should use both of these observation types. This finding is consistent with the 
general expectation that using more observations in a data assimilation system will 
improve its output. 

In some cases (for example, Sects. 3.1 and 3.2), the appropriate observation operator 
and assimilation system configuration entail that neighboring grid cells (or land model 
tiles) are no longer computationally independent in the assimilation system, even if they 
are independent in the land model (Reichle and Koster 2003). These computational 
dependencies arise through spatially correlated perturbation fields or spatially distributed 
analysis update calculations. Such “three-dimensional” land data assimilation systems 
therefore necessitate greater computational resources than simpler, “one-dimensional” 
assimilation systems where all model grid cells (or tiles) are treated independently. 
It is assumed here that the purely technical challenge of computational demand can be 
overcome with sophisticated software engineering and the increasing availability of 
affordable and massively parallel computing architectures. 


5 Conclusions and Outlook 

The present paper focused on the seeming mismatch between satellite observations and the 
water cycle variables of interest, and how this mismatch can be overcome through careful 
design and application of a land data assimilation system. Responding to the challenge 
questions of Sect. 1, we find that, if designed properly, a land data assimilation system can 
enable 

1. the horizontal downscaling of coarse-scale satellite observations, 

2. the partitioning of vertically integrated satellite measurements such as TWS into their 
water cycle components, 

3. the direct assimilation of satellite radiances for soil moisture or snow analyses, and 

4. the propagation of information from observed fields such as precipitation and surface 
soil moisture into variables such as root zone soil moisture that are of great interest 
but are not directly observed by satellites. 

Naturally, many challenges still lie ahead. State-of-the-art land data assimilation 
algorithms are only now emerging in operational systems. Much of the recent progress has 
been achieved in so-called “off-line” (land-only) assimilation systems. These advances 
need to be incorporated into the coupled land-atmosphere systems used in atmospheric 
data assimilation and numerical weather prediction (NWP). Ground-breaking advances in 
coupled land-atmosphere data assimilation are being made, for example, at ECMWF (de 
Rosnay et al. 2012a, b). At the same time, the coupling of the GEOS-5 LDAS to the 
GEOS-5 atmospheric data assimilation system is underway at the NASA GMAO. 

Moreover, much of the progress in land data assimilation has been with systems that 
assimilate only one type of observation, often surface soil moisture. In future, more 
emphasis will need to be placed on the assimilation of multiple types of observations 
within a single assimilation system, including observations of water cycle components 
such as soil moisture, SWE, snow cover fraction, TWS, and precipitation. 

Future development should also address the addition or improvement of runoff routing 
and surface water storage model components in the global land surface models used in 
NWP. The planned NASA Surface Water and Ocean Topography (SWOT; Durand et al. 
2010) mission, for instance, will provide high-resolution observations of surface water 
elevation. To improve our understanding of the global hydrological cycle, it will be crucial 
to incorporate these new observations into global land data assimilation systems, building 
on early studies such as those by Andreadis et al. (2007), Biancamaria et al. (2011), and 
Durand et al. (2008). 

Finally, the existing global land data assimilation systems will need to consider the 
modeling of vegetation dynamics and the assimilation of current or planned satellite 
observations such as the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), 
the Leaf Area Index (LAI), or the multi-angular Photochemical Reflectance Index (PRI) 
(Albergel et al. 2010; Hilker et al. 2012; Kaminski et al. 2012; Knorr et al. 2010; Munoz 
Sabater et al. 2008; Stockli et al. 2011). Furthermore, current microwave sensors already 
provide observations of the freeze-thaw state of the landscape at coarse scales (Kim et al. 
2010), and SMAP will provide much higher-resolution observations with continental 
coverage (Entekhabi et al. 2010). These vegetation and freeze-thaw observations link the 
hydrological and carbon cycles and should be used in global land data assimilation systems. 